Take-Home_Ex04: Rain, Hail or Shine: Unveiling Mysteries of the Sky

Working Document for Project Work

Author

Roger Chen

Published

February 24, 2024

Modified

February 24, 2024

1 Overview

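In this take-home exercise, we will analyse historical daily temperature records for the Changi weather station in Singapore through static and interactive data visualisations.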

2 Data Preparation

2.1 Loading R Packages

In this take-home exercise, the following R packages will be used: tidyverse, ggplot2, gganimate, plotly, ggiraph and DT.

The code chunk used is as follows:

pacman::p_load(tidyverse, ggplot2, gganimate, plotly, ggiraph, DT)

2.2 Importing Temperature Data

Changi is selected as the weather station, and temperature is chosen as the variable to be analysed. The historical daily temperature datasets will be downloaded from the Meteorological Service Singapore website.

Lastly, we will combine the five datasets into a single data frame and save it as a new dataset.
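The combining step described above could be sketched as follows. This is a minimal illustration rather than the code actually used: the helper name combine_csv_files(), the folder data/raw and the output path in the usage comment are hypothetical, and the sketch assumes the downloaded files are CSVs that share the same column names.

```r
library(tidyverse)

# Hypothetical helper: read every CSV in `dir`, stack the rows, and save the result.
# Assumes all files share the same columns. Write `out_file` outside `dir` so that
# a re-run does not pick up the combined file as an input.
combine_csv_files <- function(dir, out_file) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  combined <- bind_rows(
    lapply(files, function(f) read_csv(f, show_col_types = FALSE))
  )
  write_csv(combined, out_file)
  combined
}

# Usage (paths are assumptions, not taken from the exercise):
# combinedTemp <- combine_csv_files("data/raw", "data/combinedTemp.csv")
```

Stacking with bind_rows() matches columns by name, so files whose columns appear in a different order are still aligned correctly.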

We will then load the dataset into the environment as data.

data <- read_csv("data/daily_historical.csv")
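Because the analysis focuses on temperature at Changi, the loaded table would need to be narrowed to that station at some point. The sketch below shows one possible way to do so and to build a proper date column from year, month and day. The helper name select_station_temperature() is hypothetical, and the sketch assumes the station is spelled exactly "Changi" in the station column.

```r
library(tidyverse)

# Hypothetical helper: keep one station's temperature records and add a Date column.
# Assumes the station is spelled exactly `station_name` in the `station` column.
select_station_temperature <- function(df, station_name = "Changi") {
  df %>%
    filter(station == station_name) %>%
    mutate(date = lubridate::make_date(year, month, day)) %>%
    select(station, date, mean_temperature, maximum_temperature, minimum_temperature)
}

# Usage with the `data` object loaded above:
# changi_temp <- select_station_temperature(data)
```

A single Date column is usually easier to plot on a time axis than three separate numeric columns.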

2.3 Summary Statistics of Data

Using DT, we will display the dataset as an interactive table on the HTML page.

DT::datatable(data, class = "display compact", style = "bootstrap5")
Note

The data table appears to be in order.

str(data)
spc_tbl_ [329,156 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ station                 : chr [1:329156] "Macritchie Reservoir" "Macritchie Reservoir" "Macritchie Reservoir" "Macritchie Reservoir" ...
 $ year                    : num [1:329156] 1980 1980 1980 1980 1980 1980 1980 1980 1980 1980 ...
 $ month                   : num [1:329156] 1 1 1 1 1 1 1 1 1 1 ...
 $ day                     : num [1:329156] 1 2 3 4 5 6 7 8 9 10 ...
 $ daily_rainfall_total    : num [1:329156] 0 0 0 0 22.6 49.6 2.4 0 0 0 ...
 $ highest_30_min_rainfall : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ highest_60_min_rainfall : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ highest_120_min_rainfall: num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ mean_temperature        : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ maximum_temperature     : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ minimum_temperature     : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ mean_wind_speed         : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 $ max_wind_speed          : num [1:329156] NA NA NA NA NA NA NA NA NA NA ...
 - attr(*, "spec")=
  .. cols(
  ..   station = col_character(),
  ..   year = col_double(),
  ..   month = col_double(),
  ..   day = col_double(),
  ..   daily_rainfall_total = col_double(),
  ..   highest_30_min_rainfall = col_double(),
  ..   highest_60_min_rainfall = col_double(),
  ..   highest_120_min_rainfall = col_double(),
  ..   mean_temperature = col_double(),
  ..   maximum_temperature = col_double(),
  ..   minimum_temperature = col_double(),
  ..   mean_wind_speed = col_double(),
  ..   max_wind_speed = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
Note

The data types are as expected: station is a character column and the other 12 columns are numeric.

Checking for missing values

sum(is.na(data))
[1] 1876509
Note

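The dataset contains 1,876,509 missing values. The str() output above suggests these are concentrated in the rainfall-intensity, temperature and wind-speed columns, which are NA for the earliest Macritchie Reservoir records.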
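Since sum(is.na(data)) returns only a grand total, a per-column breakdown can show where the missing values sit. The following is a minimal sketch; the helper name count_missing() is an assumption introduced here for illustration.

```r
library(tidyverse)

# Hypothetical helper: count and rank missing values per column.
count_missing <- function(df) {
  df %>%
    # One integer per column: the number of NA entries
    summarise(across(everything(), ~ sum(is.na(.x)))) %>%
    # Reshape to one row per column name
    pivot_longer(everything(), names_to = "column", values_to = "n_missing") %>%
    # Express the count as a percentage of all rows
    mutate(pct_missing = 100 * n_missing / nrow(df)) %>%
    arrange(desc(n_missing))
}

# Usage with the `data` object loaded earlier:
# count_missing(data)
```

Columns with a high pct_missing are candidates for exclusion or for restricting the analysis to the years and stations where they were actually recorded.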

3 Static Data Visualisation

4 Interactive Data Visualisation

5 Conclusion